NSF PAR Search | NSF Public Access Repository

Controlling Neural Collapse Enhances Out-of-Distribution Detection and Transfer Learning

Harun, MY; Gallardo, J; Kanan, C (July 2025, Proc. International Conference on Machine Learning (ICML))

Out-of-distribution (OOD) detection and OOD generalization are widely studied in Deep Neural Networks (DNNs), yet their relationship remains poorly understood. We empirically show that the degree of Neural Collapse (NC) in a network layer is inversely related to these objectives: stronger NC improves OOD detection but degrades generalization, while weaker NC enhances generalization at the cost of detection. This trade-off suggests that a single feature space cannot simultaneously achieve both tasks. To address this, we develop a theoretical framework linking NC to OOD detection and generalization. We show that entropy regularization mitigates NC to improve generalization, while a fixed Simplex ETF projector enforces NC for better detection. Based on these insights, we propose a method to control NC at different DNN layers. In experiments, our method excels at both tasks across OOD datasets and DNN architectures.
Free, publicly-accessible full text available July 13, 2026
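
For readers unfamiliar with the "fixed Simplex ETF projector" named in the abstract above, the following is a minimal NumPy sketch of how a simplex equiangular tight frame (ETF) is commonly constructed. It is not the authors' implementation: the function name `simplex_etf`, the dimensions, and the use of a random orthonormal basis are assumptions chosen for illustration.

```python
import numpy as np


def simplex_etf(num_classes: int, feat_dim: int, seed: int = 0) -> np.ndarray:
    """Return a (feat_dim, num_classes) matrix whose columns form a simplex ETF.

    Hypothetical helper for illustration only; not taken from the paper's code.
    """
    if num_classes < 2 or feat_dim < num_classes:
        raise ValueError("need num_classes >= 2 and feat_dim >= num_classes")
    rng = np.random.default_rng(seed)
    # Orthonormal columns from the reduced QR of a random Gaussian matrix
    # (an assumed choice; any matrix with orthonormal columns would do).
    basis, _ = np.linalg.qr(rng.standard_normal((feat_dim, num_classes)))
    k = num_classes
    # Subtract the mean direction, then rescale so every column has unit norm.
    centering = np.eye(k) - np.ones((k, k)) / k
    return np.sqrt(k / (k - 1)) * basis @ centering


etf = simplex_etf(num_classes=10, feat_dim=64)
gram = etf.T @ etf  # pairwise inner products between the class directions
```

With this construction every column has unit norm and every pair of distinct columns has inner product -1/(K-1), which is the maximally separated, equiangular geometry associated with neural collapse. A "fixed" projector would presumably keep such a matrix frozen during training rather than learning it.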
What Variables Affect Out-of-Distribution Generalization in Pretrained Models?

Harun, M Y; Lee, K; Gallardo, J; Krishnan, G; Kanan, C (December 2024, Neural Information Processing Systems (NeurIPS))

Embeddings produced by pre-trained deep neural networks (DNNs) are widely used; however, their efficacy for downstream tasks can vary widely. We study the factors influencing transferability and out-of-distribution (OOD) generalization of pre-trained DNN embeddings through the lens of the tunnel effect hypothesis, which is closely related to intermediate neural collapse. This hypothesis suggests that deeper DNN layers compress representations and hinder OOD generalization. Contrary to earlier work, our experiments show this is not a universal phenomenon. We comprehensively investigate the impact of DNN architecture, training data, image resolution, and augmentations on transferability. We identify that training with high-resolution datasets containing many classes greatly reduces representation compression and improves transferability. Our results emphasize the danger of generalizing findings from toy datasets to broader contexts.
Full Text Available
Human Emotion Estimation through Physiological Data with Neural Networks

Gallardo, J; Savur, C; Sahin, F; Kanan, C (June 2024, IEEE)

Effective collaboration between humans and robots necessitates that the robotic partner can perceive, learn from, and respond to the human's psycho-physiological conditions. This involves understanding the emotional states of the human collaborator. To explore this, we collected subjective assessments (specifically, feelings of surprise, anxiety, boredom, calmness, and comfort) as well as physiological signals during a dynamic human-robot interaction experiment. The experiment manipulated the robot's behavior to observe these responses. We gathered data from this non-stationary setting and trained an artificial neural network model to predict human emotion from physiological data. We found that using several subjects' data to train a general model and then fine-tuning it on the subject of interest performs better than training a model using only the data from the subject of interest.
Full Text Available
SIESTA: Efficient Online Continual Learning with Sleep

Harun, Md Y.; Gallardo, J.; Hayes, Tyler L.; Kemker, Ronald; Kanan, Christopher (November 2023, Transactions on Machine Learning Research)

Full Text Available
How efficient are today’s continual learning algorithms?

Harun, M.Y.; Gallardo, J.; Hayes, T.L.; Kanan, C. (July 2023, CVPR Workshop on Continual Learning in Computer Vision (CLVISION))

Supervised continual learning involves updating a deep neural network (DNN) from an ever-growing stream of labeled data. While most work has focused on overcoming catastrophic forgetting, one of the major motivations behind continual learning is being able to efficiently update a network with new information, rather than retraining from scratch on the training dataset as it grows over time. Although recent continual learning methods have largely solved the catastrophic forgetting problem, little attention has been paid to the efficiency of these algorithms. Here, we study recent methods for incremental class learning and illustrate that many are highly inefficient in terms of compute, memory, and storage. Some methods even require more compute than training from scratch! We argue that for continual learning to have real-world applicability, the research community cannot ignore the resources used by these algorithms. There is more to continual learning than mitigating catastrophic forgetting.
Full Text Available
